We study the impact of synchronous and asynchronous monitoring instrumentation on runtime overheads
in the context of a runtime verification framework for actor-based systems. We show that, in
such a context, asynchronous monitoring incurs substantially lower overhead costs. We also show 
how, for certain properties that require synchronous monitoring, a hybrid approach can be used that 
ensures timely violation detections for the important events while, at the same time, incurring lower 
overhead costs that are closer to those of an asynchronous instrumentation. 

1 Introduction 

Formally ensuring the correctness of component-based, concurrent systems is an arduous task, mainly 
because exhaustive methods such as model-checking quickly run into state-explosion problems; this is 
typically caused by the multiple thread interleavings of the system being analysed, and the range of data 
the system can input and react to. Runtime Verification (RV) [33] is an appealing compromise towards
ensuring correctness, as it circumvents such scalability issues by verifying only the current system
execution. Runtime Enforcement (RE) [?] builds on RV by automating recovery procedures once a violation
is detected so as to mitigate or rectify the effects of the violation. Together, these runtime techniques can 
be used as a disciplined methodology for augmenting systems with self-adaptation functionality. 

Most RV and RE frameworks work by synthesising monitors from properties specified in terms of a
formal language, and then executing these monitors in tandem with the system. Online monitoring, i.e.,
the runtime analysis of a system from the partial execution trace generated thus far, usually comes in two 
instrumentation flavours. In synchronous runtime monitoring, system trace events are forwarded to the 
monitor while the system is paused, waiting for the monitor to acknowledge back before it can continue 
executing (until the next trace event is generated). By contrast, in asynchronous monitoring, the system 
does not pause when trace events are generated; instead, events are kept in a buffer and processed by the 
monitor at some later stage, thereby decoupling the execution of the system from that of the monitor. 

Both forms of instrumentation have their merits. Synchronous monitoring guarantees timely detection
of property violations/satisfaction since the system and monitor execute in lockstep; this facilitates
the runtime enforcement of properties, where remedial action can be promptly applied to a system
waiting for a monitor response. Although asynchronous monitoring may lead to late detections, it is less
intrusive than its synchronous counterpart. Its instrumentation is easier to carry out and does not
necessarily require access to the system code prior to execution. The associated overheads are also assumed
to be lower than those of its synchronous counterpart, and it is a more natural fit for settings with
inherent notions of asynchrony, such as distributed systems. Asynchronous monitoring also poses a lower
risk of compromising system behaviour in the eventual case of an erroneous monitoring algorithm that forgets
to acknowledge back or diverges (i.e., enters an infinite internal loop) since, in asynchronous monitoring,
the system execution is decoupled from that of the monitor.

Issues relating to monitor instrumentation are particularly relevant to actor-based [35, 5] component
systems. Synthesising asynchronous monitors as actors is in accordance with the actor model of
computation, requiring independent computing entities to execute in decoupled fashion so as to permit scalable
coding techniques such as fail-fast design patterns [?]; such code organisations (using monitor actors
called supervisors) are prevalent in actor-based languages and technologies such as Erlang [?], Scala
[27] and AKKA [?]. However, there are cases where tighter analyses through synchronous monitoring
may be required, particularly when timely detections improve the effectiveness of subsequent recovery
procedures. Crucially, the appropriate monitor instrumentation also needs to incur low runtime overheads
for it to be viable.

In this paper we investigate issues related to monitor instrumentation in actor-based component
systems constructed using Erlang, a mature, industry-strength language used to build fault-tolerant,
self-adaptive systems with high degrees of resilience [?]. As a representative Erlang system for our
experiments we consider Yaws [43, 30], a high performance HTTP Webserver that makes extensive use of actors
to handle multiple concurrent client connections. In order to show how this study can be used to verify
and enforce correct behaviour of Erlang systems, we also employ detectEr [23, 24], a component-based
RV tool designed to monitor for Erlang safety properties. Within this setting:

1. we design and implement mechanisms for synchronous instrumentation; asynchronous monitoring
is natively supported through mechanisms offered by the Erlang Virtual Machine (EVM), which
are presently employed by detectEr, but also by other tools such as [?, 20].

2. we quantify the relative computational overhead incurred by synchronous and asynchronous
monitoring; it is generally accepted that asynchronous monitoring incurs lower overheads, but we are not
aware of any studies that attempt to quantify by how much.

3. we devise a novel hybrid instrumentation technique that guarantees timely detections while
incurring overheads comparable to those of asynchronous monitoring. We also assess the effectiveness
of the technique with respect to the other methods of instrumentation.

4. we integrate the different modes of instrumentation within detectEr, allowing one to specify
correctness properties with multiple violation conditions where some violations are monitored
asynchronously, some are monitored synchronously, and for others monitoring switches between
synchronous and asynchronous modes. Although tools for synchronous and asynchronous monitoring
exist, we are not aware of any that combine the two within the same monitor execution.

Although the work in this paper focusses on detection, it lays the necessary foundations for implementing 
effective enforcement mechanisms that are launched as soon as violations are detected, allowing us to 
augment systems with self-adaptive functionality.

The rest of the paper is structured as follows. Sec. 2 briefly outlines actors in Erlang and introduces
Yaws. Sec. 3 describes the logic used by the tool detectEr and shows how this logic can specify safety
properties for Yaws. In Sec. 4 we devise a technique for instrumenting synchronous monitors for an
actor system and assess the relative overhead costs compared to the existing asynchronous
instrumentation. Sec. 5 presents a novel hybrid setup where one can specify which events are to be monitored
synchronously or asynchronously. We assess the overheads incurred by the new instrumentation
technique in Sec. 6. Sec. 7 concludes by outlining future and related work.

[Figure 1 diagram; recoverable labels: (1) connect; (2) {HandlerPid, next, ClientPort}; New Handler; (4) HTTP requests]

Figure 1: Yaws client connection protocol


2 Erlang Actors and the Yaws Webserver 

Erlang [?] is an actor-based [5] programming language where processes (i.e., actors) are threads of
execution that are uniquely identified by a process identifier and own their own local memory. Erlang
processes execute asynchronously to one another, interacting through asynchronous messages: instead
of sharing data, processes explicitly send a copy of this data to the destination process (using the unique
identifier as address); messages are received at a process mailbox (a form of message buffer) and can
be exclusively read at a later stage by the process owning the mailbox. Processes may also spawn other
processes at runtime, and communicate locally-known (i.e., private) process identifiers as messages.
Implementation-wise, Erlang processes are relatively lightweight, and language coding practices
recommend the use of concurrent processes wherever possible. These factors ultimately lead to systems made
up of independently-executing components that are more scalable, maintainable, and use the underlying
parallel architectures more efficiently [9].

At its core, Erlang is dynamically-typed, and function calls with mismatching parameters are
typically detected at runtime. Function definitions are named and organised in uniquely-named modules.
Within a particular module, there can be more than one function with the same name, as long as these
functions have different arities (i.e., number of parameters). Thus, every function is uniquely identified by
the triple module_name:function_name:arity.

Erlang offers a number of fault-tolerance mechanisms. Since asynchronous messages may reach a 
mailbox in a different order than the one intended, the (mailbox) read construct uses pattern matching 
to allow a process to retrieve the first message in the mailbox matching a pattern, possibly reading 
messages out of order. Erlang also offers a mechanism for process linking, whereby a process receives
a notification message in its mailbox when a linked process fails, i.e., terminates abnormally. Erlang
systems are often structured by the supervisor pattern, which is built on linking: processes at the system
fringes are encouraged to fail-fast when an error occurs (as opposed to handling the error locally), leaving
error handling to linked supervisor processes [?].
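
The out-of-order mailbox read described above can be illustrated with a small model. The following Python sketch is only a toy stand-in for an Erlang mailbox: the class `Mailbox` and its methods are hypothetical names introduced here, and, unlike a real Erlang receive, the read returns `None` instead of blocking when no message matches.

```python
from collections import deque


class Mailbox:
    """Toy stand-in for a process mailbox (not Erlang's implementation)."""

    def __init__(self):
        self._queue = deque()

    def deliver(self, message):
        # Messages are appended in arrival order.
        self._queue.append(message)

    def receive(self, matches):
        """Remove and return the oldest message accepted by `matches`.

        Returns None when nothing matches; a real Erlang process would
        block instead.
        """
        for index, message in enumerate(self._queue):
            if matches(message):
                del self._queue[index]
                return message
        return None

    def pending(self):
        return list(self._queue)


mailbox = Mailbox()
for message in [("data", 1), ("ack", 42), ("data", 2)]:
    mailbox.deliver(message)

# Read the acknowledgement first, although it arrived second.
first_read = mailbox.receive(lambda m: m[0] == "ack")
```

Messages that do not match stay in the mailbox in their original order, so later reads can still retrieve them.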

2.1 YAWS: A Webserver written in Erlang 

Yaws [43, 30] is a high-performance, component-based HTTP Webserver written in Erlang. For every
client connection, the server assigns a dedicated (concurrent) handler that services HTTP client requests, 
thereby parallelising processing for multiple clients. Its implementation relies on the lightweight nature 
of Erlang processes to efficiently handle a vast number of simultaneous client connections. 

The Yaws protocol for establishing client connections is depicted in Fig. 1. This protocol uses an
acceptor component which, upon creation, spawns a connection handler to be assigned to the next client 
connection. Subsequently, the acceptor blocks waiting for messages in its mailbox, while the unassigned 
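
$$
\varphi, \psi \in \mathrm{FRM} ::= \mathsf{ff} \;\mid\; \varphi \,\&\, \psi \;\mid\; [\alpha]\varphi \;\mid\; X \;\mid\; \max(X, \varphi) \;\mid\; \mathsf{bool}\; b \Rightarrow \varphi
$$

$$
\begin{aligned}
\llbracket \mathsf{ff} \rrbracket &\stackrel{\text{def}}{=} \emptyset \\
\llbracket \varphi \,\&\, \psi \rrbracket &\stackrel{\text{def}}{=} \llbracket \varphi \rrbracket \cap \llbracket \psi \rrbracket \\
\llbracket [\alpha]\varphi \rrbracket &\stackrel{\text{def}}{=} \{ A \mid (A \stackrel{\beta}{\Longrightarrow} B \text{ and } \mathrm{match}(\alpha, \beta) = \sigma) \text{ implies } B \in \llbracket \varphi\sigma \rrbracket \} \\
\llbracket \max(X, \varphi) \rrbracket &\stackrel{\text{def}}{=} \bigcup \{ S \mid S \subseteq \llbracket \varphi\{S/X\} \rrbracket \} \\
\llbracket \mathsf{bool}\; b \Rightarrow \varphi \rrbracket &\stackrel{\text{def}}{=}
\begin{cases}
\llbracket \varphi \rrbracket & \text{if } b \Downarrow \mathit{true} \\
\mathrm{Actr} & \text{if } b \Downarrow \mathit{false}
\end{cases}
\end{aligned}
$$

Figure 2: The Logic and its Semantics
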


handler waits for the next TCP connection request. Clients send connection requests through standard
TCP ports (1), which are received as messages in the handler's mailbox. The current handler accepts
these requests by reading the respective message from its mailbox and (2) sending a message containing its
own pid and the port of the connected client to the acceptor; this acts as a notification that it is now
engaged in handling the connection of a specific client. Upon receiving the message, the acceptor
unblocks, records the information sent by the handler for supervision purposes (e.g., restarting the handler
in case it crashes) and (3) spawns a new handler listening for future connection requests.

Once it is assigned a handler, the connected client then engages directly with it using (4) standard
HTTP requests; these normally consist of six (or more) HTTP headers containing information such
as the client's User Agent, Accept-Encoding and the Keep-Alive flag status. HTTP request information
is not sent in one go but follows a protocol of messages: it starts by sending the http_req, followed by
six http_header messages containing client information, terminated by a final http_eoh message. The
dedicated connection handler inspects the request and services the respective HTTP request accordingly.


3 The Logic 


We conduct our investigations using the component-based runtime verification tool called detectEr. In
[23], the authors present a tool that synthesises concurrent monitors (as systems of Erlang processes)
from a syntactic subset of the modal $\mu$-calculus specifying safety properties for Erlang systems; the
sublogic is called sHML [?]. In [23], these monitors asynchronously analyse the system so as to check
for runtime violations of the respective formulas. Actor-based systems such as those constructed using
Erlang typically grow and shrink in size as computation progresses.² Accordingly, the component-based
monitors generated by detectEr are able to scale with the current size of the system being monitored.

The syntax of our logic is defined inductively using the BNF description in Fig. 2. The logic is an
extension to that of [23], facilitating the expression of properties dealing with data. It is parametrised
by a set of boolean expressions, $b, c \in \mathrm{Bool}$, equipped with a decidable evaluation function, $b \Downarrow v$ where
$v \in \{\mathit{true}, \mathit{false}\}$, and a set of actions $\alpha, \beta \in \mathrm{Act}$ that may universally quantify over data values. It assumes


2 For instance, the Yaws Webserver of Sec. 2.1 creates a new handler component for every new client request.
two distinct denumerable sets of term variables, $x, y, \ldots \in \mathrm{Var}$, used in actions and boolean expressions,
and formula variables $X, Y, \ldots \in \mathrm{LVar}$, used to define recursive logical formulas; in what follows, we
work up to $\alpha$-conversion of bound variables.³ Formulas include falsity, $\mathsf{ff}$, conjunctions, $\varphi \,\&\, \psi$, modal
necessities, $[\alpha]\varphi$, and maximal fixpoints, $\max(X, \varphi)$, from [23]. One important extension to [23] is that
actions, $\alpha$, used in necessity formulas, $[\alpha]\varphi$, may contain term variables that pattern-match with actual
(closed) actions. We also extend the logic with boolean guards, $\mathsf{bool}\; b \Rightarrow \varphi$, that may contain term
variables introduced by necessity formulas. A formula $[\alpha]\varphi$ is thus a binder for the variables used in $\alpha$
in the subformula $\varphi$; similarly $\max(X, \varphi)$ is a binder for $X$ in $\varphi$. To improve readability, we sometimes
denote term variables introduced by $\alpha$ in $[\alpha]\varphi$ that are not used in the subformula $\varphi$ as underscores, _.

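The semantics of the logic is defined over an arbitrary labelled transition system (LTS),
$\langle \mathrm{Actr}, \mathrm{Act} \cup \{\tau\}, \rightarrow \rangle$, where $\mathrm{Actr}$, ranged over by $A, B$, is the set of nodes denoting actor
systems, $\mathrm{Act} \cup \{\tau\}$ is a set of actions including a silent (internal) action $\tau$, and $\rightarrow$ is a ternary
relation of type $\mathrm{Actr} \times (\mathrm{Act} \cup \{\tau\}) \times \mathrm{Actr}$. In
[23], the authors show how Erlang programs can be given an LTS semantics and, in this paper, we
conveniently adopt that semantics. In practice, however, other LTS semantics for the language, such as
[?, 25], can be used instead. We write $A \stackrel{\alpha}{\rightarrow} B$ (resp. $A \stackrel{\tau}{\rightarrow} B$) in lieu of $(A, \alpha, B) \in\, \rightarrow$
(resp. $(A, \tau, B) \in\, \rightarrow$) and write $A \stackrel{\alpha}{\Longrightarrow} B$ to denote $A\, (\stackrel{\tau}{\rightarrow})^{*} \cdot \stackrel{\alpha}{\rightarrow} \cdot (\stackrel{\tau}{\rightarrow})^{*}\, B$.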

Our logic semantics is presented in Fig. 2 through the denotational function $\llbracket - \rrbracket :: \mathrm{FRM} \to \mathcal{P}(\mathrm{Actr})$,
defined inductively on the structure of the formula. The definition assumes well-formed formulas, i.e.,
formulas where all variables are bound and formula variables are guarded (appear under a necessity
formula). We say $A$ satisfies $\varphi$, denoted as $A \models \varphi$, whenever $A \in \llbracket \varphi \rrbracket$. The semantics follows that of [23]. No
actor system satisfies $\mathsf{ff}$, whereas actors satisfying $\varphi \,\&\, \psi$ must satisfy both $\varphi$ and $\psi$. In our extension, the
necessity formula is imbued with pattern-matching functionality, represented by the function $\mathrm{match}(\alpha, \beta)$
matching a (possibly) open action $\alpha$ (i.e., with term variables in it) with a closed action $\beta$, returning a
substitution, $\sigma :: \mathrm{Var} \rightharpoonup \mathrm{Val}$, whenever successful.


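$$
\begin{aligned}
\mathrm{match}(\mathtt{server}\ !\ \{x, \mathtt{ack}, y\},\ \mathtt{server}\ !\ \{5, \mathtt{ack}, \mathtt{joe}\}) &= \{x \mapsto 5,\ y \mapsto \mathtt{joe}\} && (1) \\
\mathrm{match}(\mathtt{server}\ !\ \{5, \mathtt{ack}, \mathtt{joe}\},\ \mathtt{server}\ !\ \{5, \mathtt{ack}, \mathtt{joe}\}) &= \emptyset && (2) \\
\mathrm{match}(\mathtt{client}\ !\ \{x, \mathtt{ack}, y\},\ \mathtt{server}\ !\ \{5, \mathtt{ack}, \mathtt{joe}\}) &\ \text{is undefined} && (3) \\
\mathrm{match}(\mathtt{server}\ ?\ \{x, \mathtt{ack}, y\},\ \mathtt{server}\ !\ \{5, \mathtt{ack}, \mathtt{joe}\}) &\ \text{is undefined} && (4)
\end{aligned}
$$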


For example, in (1) the open output action server ! {x, ack, y} is successfully matched with the output
action server ! {5, ack, joe}, where x and y are pattern matched with the values 5 and joe resp. In (2) the
two closed actions are matched (exactly), returning the empty substitution. The mismatch in (3) is due to
mismatching destinations of the output actions, i.e., server versus client, whereas the mismatch in (4)
is because the input action server ? {x, ack, y} cannot be pattern matched with actions of a different kind
(e.g., output). Necessity formulas $[\alpha]\varphi$ are satisfied by all actor systems $A$ observing the condition that,
whenever pattern-matchable actions $\beta$ are performed (yielding substitution $\sigma$), the resulting actors $B$ that
are transitioned to must satisfy $\varphi\sigma$. Note that actors that do not perform any pattern-matchable actions
trivially satisfy $[\alpha]\varphi$. Formula $\max(X, \varphi)$ denotes the maximal fixpoint of the functional $\llbracket \varphi \rrbracket$; following
standard fixpoint theory [42], this is characterised as the union of all post-fixpoints $S \in \mathcal{P}(\mathrm{Actr})$. Guarded
formulas $\mathsf{bool}\; b \Rightarrow \varphi$ equate to $\varphi$ whenever $b$ evaluates to true but are trivially satisfied whenever $b$
evaluates to false.
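
To make the behaviour of match concrete, the following Python sketch reproduces cases (1) to (4). It is an illustration under our own assumptions rather than detectEr's implementation: actions are encoded as nested tuples `(subject, kind, payload)`, the class `Var` marks term variables, an undefined result is modelled as `None`, and a variable occurring twice must bind to equal values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A term variable inside an open action (hypothetical encoding)."""
    name: str


def match(pattern, action):
    """Return a substitution (dict) if `pattern` matches the closed
    `action`, or None when the match is undefined."""
    substitution = {}

    def unify(p, a):
        if isinstance(p, Var):
            if p.name in substitution:
                # Assumption: a repeated variable must bind consistently.
                return substitution[p.name] == a
            substitution[p.name] = a
            return True
        if isinstance(p, tuple):
            return (isinstance(a, tuple) and len(p) == len(a)
                    and all(unify(pi, ai) for pi, ai in zip(p, a)))
        return p == a

    return substitution if unify(pattern, action) else None


closed = ("server", "!", (5, "ack", "joe"))
case_1 = match(("server", "!", (Var("x"), "ack", Var("y"))), closed)
case_2 = match(closed, closed)
case_3 = match(("client", "!", (Var("x"), "ack", Var("y"))), closed)
case_4 = match(("server", "?", (Var("x"), "ack", Var("y"))), closed)
```

Since the empty substitution of case (2) is an empty dictionary, callers should test the result with `is None` rather than by truthiness.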

Example 3.1. Consider the formula

$$
\max(X,\ [\mathtt{server}\ ?\ \{\mathtt{succ}, x, y\}]\,[y\ !\ z]\,((\mathsf{bool}(z = x+1) \Rightarrow X)\ \&\ (\mathsf{bool}(z \neq x+1) \Rightarrow \mathsf{ff}))) \qquad (5)
$$

3 detectEr renames duplicate variables accordingly during a pre-processing phase. 
It states that a satisfying system repeatedly observes the condition, i.e., max(X, ...), that whenever
it accepts an input at actor server with values matching the pattern {succ, x, y} (denoting a server
request from client y for a successor computation, the succ message tag, for value x) followed by a
reply output message sent to y with the answer z, then the answer is indeed an increment on value x. ■
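
As a rough illustration of how property (5) can be checked against an execution, the Python sketch below walks over a finite trace of visible actions. The trace encoding (`("recv", actor, payload)` and `("send", destination, value)`) and the function name `detects_violation` are assumptions made for this sketch; it is not the monitor that detectEr synthesises.

```python
def detects_violation(trace):
    """Return True iff the trace exhibits a violation of property (5)."""
    position = 0
    while position + 1 < len(trace):
        request, reply = trace[position], trace[position + 1]

        # [server ? {succ, x, y}]: a non-matching action trivially
        # satisfies the necessity, so monitoring stops without a verdict.
        kind, target, payload = request
        if ((kind, target) != ("recv", "server")
                or not isinstance(payload, tuple)
                or len(payload) != 3
                or payload[0] != "succ"):
            return False
        _, x, y = payload

        # [y ! z]: the reply must be an output addressed to client y.
        kind, target, z = reply
        if (kind, target) != ("send", y):
            return False

        if z != x + 1:
            return True      # bool(z != x+1) => ff
        position += 2        # bool(z = x+1) => X, i.e. recurse
    return False


good = [("recv", "server", ("succ", 3, "joe")), ("send", "joe", 4),
        ("recv", "server", ("succ", 7, "ann")), ("send", "ann", 8)]
bad = good[:3] + [("send", "ann", 9)]
```

Here `detects_violation(good)` yields no violation, whereas the last reply in `bad` answers 9 to a request for the successor of 7 and is therefore flagged.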
Remark 3.1. Apart from input and output actions from the original tool presentation of [23], our logic
extension also considers actions denoting function calls and returns, call (Pid, {module, function, values }) 
and ret (Pid, {module, function, arity, values}) resp. These are needed in actual Erlang implementations 
because certain output and input actions may be abstracted away inside the function calls of system 
libraries, making them (directly) unobservable to the instrumentation mechanism. However, there are 
cases where we can still observe these actions (and the data associated with them) indirectly, through the 
calls and returns of the functions that abstract them. 

3.1 Monitoring for safety properties in Yaws 

The logic is expressive enough to express a number of safety properties for Yaws, including the Directory 
Traversal Vulnerability found in earlier versions of the software and reported on the reputable exploit-db 
website [29]. We here discuss a second safety property for the Webserver.

If we assume the existence of a (decidable) predicate called isMalicious(), which can determine
whether the client will engage in security-breaching activity from the 6 HTTP headers sent to the
handler, we can use the logic of Fig. 2 to specify the safety property stated in (6). The property requires
that, for every client connection established (determined from message (2) of Fig. 1 and denoted in
property (6) as the action AcceptorPid ! {handID, next, _}) the following subproperty must hold: for
every HTTP request, the respective headers communicated (h1 to h6) do not amount to a potentially
security-breaching request (as determined by the predicate isMalicious()).

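$$
\begin{array}{l}
\max(X,\ [\mathtt{AcceptorPid}\ !\ \{\mathtt{handID}, \mathtt{next}, \_\}] \\
\quad (X\ \& \\
\qquad \max(Y,\ [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \{\mathtt{http\_req}, \mathtt{GET}, \_, \_\}\}\})] \\
\qquad\quad [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathit{h1}\}\})]\ [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathit{h2}\}\})] \\
\qquad\quad [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathit{h3}\}\})]\ [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathit{h4}\}\})] \\
\qquad\quad [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathit{h5}\}\})]\ [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathit{h6}\}\})] \\
\qquad\quad ((\mathsf{bool}(\mathtt{isMalicious}(\mathit{h1}, \mathit{h2}, \mathit{h3}, \mathit{h4}, \mathit{h5}, \mathit{h6})) \Rightarrow \mathsf{ff}) \\
\qquad\quad \ \&\ (\mathsf{bool}(\neg\mathtt{isMalicious}(\mathit{h1}, \mathit{h2}, \mathit{h3}, \mathit{h4}, \mathit{h5}, \mathit{h6})) \Rightarrow \\
\qquad\qquad\quad [\mathtt{ret}(\mathtt{handID}, \{\mathtt{yaws}, \mathtt{do\_recv}, 3, \{\mathtt{ok}, \mathtt{http\_eoh}\}\})]\ Y)))))
\end{array}
\qquad (6)
$$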



The logical formula stated in (6) specifies this property by using two nested recursive formulas. The outer
one, max(X, ...), refers to each assigned handler by pattern-matching with the term variable handID,
whereas the inner one, max(Y, ...), uses this value to pattern match with the header term variables,
h1 to h6, for every iteration of the HTTP request protocol; boolean guarded formulas are then used to
determine whether these HTTP requests violate the property or not. We note that, whereas the handler
messages to the acceptor are observed directly (i.e., the output action in the outer recursive formula), the
client HTTP messages received by the handler have to be observed indirectly through the return values
(of the form {ok, _}) of the invoked function do_recv.⁴ Instrumentation allowing a direct observation of

4 Defined in module yaws with arity 3. 

Figure 3: Component-based Monitoring


these actions is complicated by the fact that the client TCP messages are sent through functions from the 
inet Erlang library, which is part of the Erlang Virtual Machine kernel. 

3.2 Component-based Monitor Generation 

The monitors generated by detectEr are not monolithic, but consist of systems of concurrent (sub)monitors,
each analysing different parts of a system under examination. For instance, in the case of property (6),
detectEr first generates a submonitor that listens for the action AcceptorPid ! {handID, next, _} from
the unassigned (free) handler waiting on the TCP port. Once this action is detected, it spawns a new
submonitor to listen for the new handler, while it continues monitoring for the handler that is now assigned
to a client. Thus, after two client connections are accepted, we end up with the configuration depicted
in Fig. 3 (see [23] for details). Note that the submonitors need not communicate with each other since a
violation detected by one submonitor immediately means that the global property is violated.


4 Introducing Synchronous Instrumentation 

In [23], the monitors generated by detectEr are exclusively asynchronous, using the tracing mechanism
offered by the EVM OTP platform libraries [?], which does not require instrumentation at source-code
level; instead, VM directives generate trace messages for specified execution events that are sent to a
specially-designated tracer actor. However, as was argued in Sec. 1, there are various cases where safety
properties may need to be monitored synchronously.

There are a number of ways in which one can layer synchronous monitoring atop an asynchronous
computational model. For instance, one can insert actual monitoring functionality within the sequential
thread execution of each actor (in the style of [38, 11, 15]) and then have the monitoring code (scattered
across independently executing actors) synchronise, as required, in a choreographed setup [8, 22].
Instead, we opt for an orchestrated solution, whereby individually monitored actors are only instrumented
to report monitored actions to a (conceptually) centralised monitoring setup that receives all reported
actions and performs the necessary checking.⁵ There are a number of reasons for choosing such a setup.
First, the instrumentation code at the system side is minimal, leaving the instrumented code close to the
original. Moreover, monitoring is consolidated into a group of concurrent actors that are separate from
the monitored system, improving manageability (e.g., parts of the system may crash leaving the monitor

5 Although conceptually centralised, the orchestrator monitor consists of independent, concurrent sub-monitors as discussed
in Sec. 3.2.

A. Francalanza & I. Cassar 




System (message exchange with the monitor, in order):

    <- ack(init_nonce)
    Event e1 occurs;
    Get data d1 from e1;
    -> event(d1, nonce1)
    Block on nonce(e1);
    <- ack(nonce1)
    Event e2 occurs;
    Get data d2 from e2;
    -> event(d2, nonce2)
    Block on nonce(e2);
    <- ack(nonce2)

Monitor:

    loop(Nonce) ->
        send_ack(Nonce),
        {EVT, Nonce2} = recv_event(),
        HasPatternMatched = handle(EVT),
        if (HasPatternMatched) ->
            loop(Nonce2);
        else ->
            send_ack(Nonce2)
        end.


Figure 4: A high level description of the synchronous event monitoring protocol.


(largely) unaffected). More importantly, it allows us to perform a like-with-like comparison with the
existing asynchronous setup present in detectEr, thus obtaining a more precise comparison between the 
relative overheads of synchronous and asynchronous monitoring. 

Fig. 4 depicts the synchronisation protocol between the system instrumented code and the monitor for
two monitored events; synchrony is achieved through handshakes over asynchronous messages between
the two parties. The monitoring loop starts by sending an acknowledgement message to the system,
signalling it to execute until it produces the next monitored event. When this point is reached, the
instrumentation code at the system side extracts the necessary data associated with the event, sends it
as a trace message to the monitor and pauses by blocking on an unpause-acknowledgement message
from the monitor. Since asynchronous (acknowledgement) messages may get reordered in transit
(potentially interfering with this protocol), the instrumentation code generates a unique nonce for every
monitored event and sends it with the event data. In return, the next time the monitor sends back an
acknowledgement message, it includes this unique nonce; this allows the instrumentation code to
pattern-match for (and possibly read out-of-order) mailbox inputs containing this nonce, and unblock on the
corresponding acknowledgement. The monitoring loop outlined in Fig. 4 corresponds to the
pattern-matching functionality required by a necessity formula $[\alpha]\varphi$. For instance, if the pattern is not matched,
it acknowledges immediately to the system to continue executing and terminates the monitoring, i.e.,
command send_ack(Nonce2). However, if the pattern matches, there is still a chance that a violation
can be detected: it therefore delegates the acknowledgement (with the corresponding nonce) to the next
part of the monitor (corresponding to $\varphi$ in $[\alpha]\varphi$), i.e., command loop(Nonce2). Crucially, if $\varphi = \mathsf{ff}$,
the monitor does not send back the acknowledgement, blocking the offending system indefinitely while
flagging the violation; in a runtime enforcement setup, the execution of a recovery procedure would
substitute the violation flagging.
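
The handshake can be mimicked outside Erlang with two threads and two queues. The Python sketch below is a simplified model under our own assumptions, not the instrumentation used in detectEr: pattern matching is collapsed into a single predicate `is_violation`, nonces are consecutive integers, and an end-of-trace marker (`None`) is added so that the monitor thread can terminate.

```python
import queue
import threading

INIT_NONCE = 0


def system(events, to_monitor, acks, completed):
    """Instrumented system: report each event, then block on its nonce."""
    seen = set()

    def block_on(nonce):
        # Keep reading acknowledgements until the expected nonce arrives.
        while nonce not in seen:
            seen.add(acks.get())

    block_on(INIT_NONCE)
    for nonce, event in enumerate(events, start=1):
        to_monitor.put((event, nonce))      # event(d, nonce)
        block_on(nonce)                     # block on nonce(e)
        completed.append(event)
    to_monitor.put(None)                    # end of trace (sketch only)


def monitor(is_violation, to_monitor, acks, flagged):
    nonce = INIT_NONCE
    while True:
        acks.put(nonce)                     # send_ack(Nonce)
        message = to_monitor.get()          # recv_event()
        if message is None:
            return
        event, nonce = message
        if is_violation(event):
            flagged.append(event)           # acknowledgement withheld
            return


def run(events, is_violation):
    to_monitor, acks = queue.Queue(), queue.Queue()
    completed, flagged = [], []
    # The system is a daemon thread since a violation leaves it blocked.
    threading.Thread(target=system, daemon=True,
                     args=(events, to_monitor, acks, completed)).start()
    checker = threading.Thread(target=monitor,
                               args=(is_violation, to_monitor, acks, flagged))
    checker.start()
    checker.join()
    return completed, flagged
```

For instance, `run(["open", "read", "leak", "close"], lambda e: e == "leak")` leaves the system blocked at the offending event, so only `open` and `read` are ever completed.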

4.1 Instrumentation through Aspect-Oriented Weaving 

The instrumentation was carried out by extending an Aspect Oriented Framework for Erlang [32] to add
the necessary instrumentation in the system code; [32] did not support aspects for output and input events.
Our aspect-based instrumentation uses a purpose-built Erlang module called advices.erl containing
the three types of advice used by our AOP injections. For send and function call events, the AOP weaves
before_advice advices reporting the event data to the monitor. For function returns, the AOP weaves
after_advice advices at the source location where the function is invoked; in this case, after advices
are required since the return values are only known at that point. The weaving concerning mailbox-reading
events, performed through the receive construct, was not as straightforward. Since a receive may
contain multiple pattern-matching guarded clauses

receive guard_1 -> expression_1 ; ... ; guard_n -> expression_n end

the AOP weaves an upon_advice advice for each guarded expression (i.e., after each ->). At runtime, only
one receive guarded expression is triggered, at which point the necessary pattern-matched data of the
event is known and can be reported to the monitor by the advice.
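
The difference between before and after advice can be sketched with Python decorators. This is only an analogy under stated assumptions: the decorators wrap the function definition, whereas the weaving described above happens at the invocation site, and the names `reported`, `do_recv` and `handle_request` are hypothetical stand-ins (the list `reported` plays the role of the monitor's mailbox).

```python
import functools

reported = []  # stands in for messages sent to the monitor


def before_advice(function):
    """Report a call event before the wrapped function runs."""
    @functools.wraps(function)
    def woven(*args):
        reported.append(("call", function.__name__, args))
        return function(*args)
    return woven


def after_advice(function):
    """Report a return event; the value is only known after the call."""
    @functools.wraps(function)
    def woven(*args):
        result = function(*args)
        reported.append(("ret", function.__name__, len(args), result))
        return result
    return woven


@after_advice
def do_recv(socket, length, timeout):
    # Placeholder body: pretend the end-of-headers marker was read.
    return ("ok", "http_eoh")


@before_advice
def handle_request(request):
    return request.upper()


do_recv("socket", 0, 1000)
handle_request("get /")
```

The return event carries the arity and the returned value, mirroring the shape of the ret actions used in the logic.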

4.2 Preliminary results 

We conducted a number of experiments to assess the relative overheads of synchronous and
asynchronous monitoring. We synthesised numerous properties such as (6) for a Yaws server installation
handling varying amounts of client connections and HTTP requests. The graphs obtained in Fig. 6
clearly show that synchronous monitoring incurs higher overheads than its asynchronous counterpart in
terms of CPU Utilisation, memory consumption and the latencies it introduces; Fig. 6 shows substantial
responsiveness degradation when handling typical loads of client requests. To rule out any gains obtained
through the efficiencies of the OTP tracing platform, we also created our own version of asynchronous
monitoring that uses aspect orientation (but without blocking the system); for certain measures, e.g.,
memory consumption, the overhead discrepancies were even larger (see Fig. 6).

5 Hybrid Instrumentation 

Despite the benefits of synchronous monitoring, our preliminary results show that its overheads are
high enough to make it infeasible in practice. We therefore devise an alternative
instrumentation strategy with the aim of guaranteeing timely violation detections while incurring lower
overheads that are closer to those incurred by asynchronous instrumentation. The key insight is that, in
order to attain timely detections, the instrumentation need not require the system to execute in lockstep
with the monitor for every monitored event leading to the violation. Instead, (expensive) synchronous
event monitoring can be limited to the final event preceding a violation, letting the system execute
in decoupled fashion otherwise. Intuitively, for the logic of Fig. 2, these final events are the necessity
actions preceding (directly or indirectly) an ff formula.

Example 5.1. Recall property (5) from Ex. 3.1. In order to synchronously detect a violation in this
property, only action [y ! z] needs to be synchronously monitored, since it precedes an ff formula
(interposed by a conjunction and a boolean guard). Action [server ? {succ, x, y}] can be asynchronously
monitored, without affecting the timeliness of detections. ■
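
The selection of "final" actions in Example 5.1 can be phrased as a small traversal of the formula. The Python sketch below is one possible reading, under assumptions of our own: formulas are encoded as tagged tuples, and a necessity is selected when its continuation reaches the failing formula through conjunctions and boolean guards only. The function names are hypothetical and do not come from detectEr.

```python
def leads_to_fail(formula, fail="ff"):
    """True if `formula` reaches `fail` through conjunctions and guards."""
    tag = formula[0]
    if tag == fail:
        return True
    if tag == "and":
        return leads_to_fail(formula[1], fail) or leads_to_fail(formula[2], fail)
    if tag == "bool":
        return leads_to_fail(formula[2], fail)
    return False


def synchronous_actions(formula, fail="ff"):
    """Collect the necessity actions that directly precede `fail`."""
    tag = formula[0]
    if tag == "nec":
        _, action, body = formula
        here = [action] if leads_to_fail(body, fail) else []
        return here + synchronous_actions(body, fail)
    if tag == "and":
        return (synchronous_actions(formula[1], fail)
                + synchronous_actions(formula[2], fail))
    if tag in ("max", "bool"):
        return synchronous_actions(formula[2], fail)
    return []


# Property (5), with actions and guards kept as plain strings.
property_5 = ("max", "X",
              ("nec", "server ? {succ, x, y}",
               ("nec", "y ! z",
                ("and",
                 ("bool", "z = x+1", ("var", "X")),
                 ("bool", "z != x+1", ("ff",))))))

selected = synchronous_actions(property_5)
```

For property (5) the traversal selects only the action `y ! z`, in line with the example.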

5.1 Logic Extensions 

We extend the syntax of the logic introduced in Sec. 3 by two constructs: a synchronous false formula
and a synchronous necessity formula with a semantics analogous to that of $\mathsf{ff}$ and $[\alpha]\varphi$ resp. (see Fig. 2).

$$
\varphi, \psi \in \mathrm{FRM} ::= \ldots \;\mid\; \mathsf{sff}\ \text{(synchronous false)} \;\mid\; [|\alpha|]\varphi\ \text{(synchronous necessity)}
$$

In the extended logic, formulas carry additional instrumentation information relating to how they need 
to be runtime-monitored: by default, all the monitoring is asynchronous, unless one specifies that a 

System (message exchange with the monitor, in order):

    Event e1 occurs;
    Get data d1 from e1;
    -> event(d1, null)
    Event e2 occurs;
    Get data d2 from e2;
    -> event(d2, null)
    Event e3 occurs;
    Get data d3 from e3;
    -> event(d3, nonce1)
    Block on nonce(e3);
    <- ack(nonce1)

Monitor:

    loop(Nonce) ->
        if (Nonce ≠ null) ->
            send_ack(Nonce)
        end,
        {EVT, Nonce2} = recv_event(),
        HasPatternMatched = handle(EVT),
        if (HasPatternMatched) ->
            loop(Nonce2);
        else if (Nonce2 ≠ null) ->
            send_ack(Nonce2)
        end.


Figure 5: A high-level depiction of the hybrid monitoring protocol.


violation is to be synchronously monitored, sff, or that a particular event needs to be synchronously
monitored, $[|\alpha|]\varphi$.⁶


Example 5.2. We can refine property (5) as shown below, whereby we distinguish between two kinds
of violations, i.e., return values z that are less than x + 1 and return values that are greater than x + 1, and
require the latter violations to be detected in synchronous fashion.

$$
\max\Bigl(X,\ [\mathtt{server}\ ?\ \{\mathtt{succ}, x, y\}]\,[y\ !\ z]
\left(\begin{array}{l}
(\mathsf{bool}(z = x+1) \Rightarrow X) \\
\&\ (\mathsf{bool}(z < x+1) \Rightarrow \mathsf{ff}) \\
\&\ (\mathsf{bool}(z > x+1) \Rightarrow \mathsf{sff})
\end{array}\right)\Bigr)
$$




The new monitor synthesis algorithm requires a pre-processing phase to determine which events are 
to be synchronously monitored in order to implement a synchronous fail. For instance, formulas [α]sff 
and [|α|]ff are both monitored in the same way; in fact, the pre-processing phase encodes the former 
into the latter. In general, however, determining which actions to synchronously monitor for implementing 
a synchronous fail is not as straightforward, since the sff and the first necessity formula preceding 
this sff may be separated by intermediate formulas such as conjunctions and boolean guards (as in 
the property of Ex. 5.2). In such a case, the compiler inspects each boolean guard and checks whether 
at least one boolean guard leads directly to a synchronous fail, i.e., bool(b) => sff; 
if so, the action specified in the necessity is synchronously monitored. In hybrid monitoring, both 
synchronous and asynchronous event monitoring require code instrumentation, as shown in Fig. 5. 
Asynchronous events are instrumented with advice functions that send the monitor a message containing 
the event details and a null nonce, without blocking the system; upon receiving a null nonce, the monitor 
determines that it does not need to send an acknowledgment back to the system. Synchronous actions are 
implemented as before (see Sec. 4), where a fresh nonce indicates to the monitor that it needs to send an 
acknowledgment back. 
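
The pre-processing step described above can be illustrated with a small Python sketch over a hypothetical formula representation. The classes and the functions `reaches_sff` and `preprocess` are assumptions made for illustration and are not taken from the tool. In particular, the sketch assumes that every sff is preceded by a necessity, that each sff is rewritten to ff once its preceding necessity has been marked synchronous, and it does not follow recursion variables.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class FF:                      # ff, or sff when sync is True
    sync: bool = False


@dataclass(frozen=True)
class Var:                     # recursion variable, e.g. X
    name: str


@dataclass(frozen=True)
class Guard:                   # bool(cond) => body
    cond: str
    body: "Formula"


@dataclass(frozen=True)
class And:                     # conjunction of formulas
    conjuncts: Tuple["Formula", ...]


@dataclass(frozen=True)
class Nec:                     # [action]body, or [|action|]body when sync is True
    action: str
    body: "Formula"
    sync: bool = False


@dataclass(frozen=True)
class Max:                     # max(var, body)
    var: str
    body: "Formula"


Formula = Union[FF, Var, Guard, And, Nec, Max]


def reaches_sff(f):
    """True if an sff is reachable from f through conjunctions and guards only."""
    if isinstance(f, FF):
        return f.sync
    if isinstance(f, Guard):
        return reaches_sff(f.body)
    if isinstance(f, And):
        return any(reaches_sff(c) for c in f.conjuncts)
    return False               # a nested necessity is responsible for its own sff


def preprocess(f):
    """Mark as synchronous each necessity that directly precedes an sff."""
    if isinstance(f, FF):
        return FF()            # sff becomes ff; synchrony moves to the necessity
    if isinstance(f, Guard):
        return Guard(f.cond, preprocess(f.body))
    if isinstance(f, And):
        return And(tuple(preprocess(c) for c in f.conjuncts))
    if isinstance(f, Nec):
        return Nec(f.action, preprocess(f.body), f.sync or reaches_sff(f.body))
    if isinstance(f, Max):
        return Max(f.var, preprocess(f.body))
    return f                   # recursion variables are left untouched


# The property of Ex. 5.2, written in the hypothetical representation.
example = Max("X", Nec("server ? {succ,x,y}", Nec("y ! z", And((
    Guard("z = x+1", Var("X")),
    Guard("z < x+1", FF()),
    Guard("z > x+1", FF(sync=True)),
)))))
result = preprocess(example)
```

On this input only the necessity for `y ! z` is marked synchronous, while the one for `server ? {succ,x,y}` stays asynchronous, matching the intuition of Ex. 5.1.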


6 Evaluation 


Using the extended syntax, we reformulate the security properties used for the evaluation of Sec. 4.2 and 
require violation detections to be synchronous, i.e., using sff instead of ff. For instance, from (6) we 
obtain the property below. 

    max(X, 
      [|AcceptorPid ! {handlerPid, next, _}|] 
      (X & max(F, [ret(handlerPid, {yaws, do_recv, 3, {ok, {http_req, GET, _, _}}})] 
                  [ret(handlerPid, {yaws, do_recv, 3, {ok, h1}})] 
                  ⋮ 
                  [ret(handlerPid, {yaws, do_recv, 3, {ok, h6}})] 
                  ( (bool(isMalicious(h1,h2,h3,h4,h5,h6)) -> sff) 
                  & (bool(¬isMalicious(h1,h2,h3,h4,h5,h6)) -> 
                       [ret(handlerPid, {yaws, do_recv, 3, {ok, http_eoh}})] F) )))) 

[Figure 6 plots not reproduced. Horizontal axis: No. of Client Requests (50, 100, 200, 500, 1000, 2000). 
Series: Async w/ Tracing, Async w/ Instrumentation, Hybrid, Synchronous, Baseline.] 

Figure 6: These graphs denote the average CPU time, memory utilization and response time when 
monitoring the system using different monitoring approaches. 

6 Synchronous event monitoring can be used to engineer synchronisation points during periods where the system is not 
required to be immediately responsive, at which point a monitor is allowed to catch up with a system execution, as in 03- 

We measure the respective overheads resulting from the hybrid instrumentations over Yaws for varying 
client loads in terms of (i) the average CPU utilisation required; (ii) the memory overheads per Yaws 
client request; and (iii) the average time taken for the (monitored) Yaws server to respond to batches of 
simultaneous client requests. The experiments were carried out on an Intel Core 2 Duo T6600 processor 
with 4GB of RAM, running Microsoft Windows 7 and EVM version R16B03. For each property and each 
client load, we take three sets of readings and then average them out. Since results do not show substantial 
variations when different properties were considered, we again average them across all properties and 
compile them in the graphs shown in Fig. 6. 
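
The two-level averaging just described can be sketched as follows in Python. The function name `average_overheads`, the nested-dictionary layout and the numeric values are all hypothetical placeholders chosen for illustration; they are not measurements from the experiments.

```python
from statistics import mean


def average_overheads(readings):
    """readings[property][load] holds the repeated readings for one setting.

    Returns {load: value}, where each value is the mean over properties of
    the mean over the repeated readings for that client load.
    """
    loads = sorted({load for per_load in readings.values() for load in per_load})
    return {
        load: mean([mean(per_load[load]) for per_load in readings.values()])
        for load in loads
    }


# Placeholder data: two properties, two client loads, three readings each.
placeholder_readings = {
    "property_a": {50: [1.0, 2.0, 3.0], 100: [4.0, 4.0, 4.0]},
    "property_b": {50: [4.0, 4.0, 4.0], 100: [6.0, 6.0, 6.0]},
}
averaged = average_overheads(placeholder_readings)
```

Averaging across properties is only meaningful because, as stated above, the readings did not vary substantially between properties.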

The results show that the hybrid instrumentation yielded CPU utilisation and memory overheads that 
are substantially lower than those incurred by a synchronous instrumentation, and comparable to those of 
asynchronous monitoring with code injections. The second graph even shows that the memory utilisation 
for both of these instrumentations is less than that for asynchronous monitoring performed through the 
EVM tracing of [23]. In the case of response-time latencies, where synchronous monitoring fared the 
worst (on average 30% higher overheads than its asynchronous counterpart), a hybrid approach managed 
to lower response times to overheads that are about 15% higher than asynchronous monitoring; see Fig. 6, 
bottom graph. 
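
One plausible reading of these percentages, stated here as an assumption rather than as the method actually used, is a relative overhead with respect to the asynchronous response time. The Python sketch below makes that reading explicit; the function name `relative_overhead` is hypothetical and the response times are made-up placeholders in milliseconds, not measured values.

```python
def relative_overhead(measured, reference):
    """Percentage by which `measured` exceeds `reference`."""
    return 100.0 * (measured - reference) / reference


# Placeholder response times in milliseconds (not experimental data).
ASYNC_MS, SYNC_MS, HYBRID_MS = 200, 260, 230

sync_overhead = relative_overhead(SYNC_MS, ASYNC_MS)      # synchronous vs asynchronous
hybrid_overhead = relative_overhead(HYBRID_MS, ASYNC_MS)  # hybrid vs asynchronous
```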

7 Conclusion 

We studied various monitoring techniques for actor-based frameworks in the context of Erlang and 
integrated them within a tool for runtime verification of actor systems. Our contributions are: 

• A novel hybrid instrumentation technique minimising the amount of (expensive) synchronous 
monitor instrumentations while still guaranteeing timely violation detections. 

• An extension to detectEr, an RV tool for Erlang (actor-based) programs, that allows the verifier to 
control which violations to monitor synchronously and asynchronously within the same property. 

• A case study demonstrating the applicability of the technique and tool to monitor safety properties 
for Yaws, a concurrent web-server written in Erlang. 

• A systematic assessment of the relative overheads incurred by different instrumentation techniques 
within an actor setting. 

7 In the extended syntax, properties that are exclusively defined in terms of synchronous necessity formulas, [|α|]φ, yield 
synchronous monitor instrumentations that are identical to those discussed in Sec. 4. 
Related Work: Several verification and modeling tools [39, 26, 2, 36, 38] for actor-based component 
systems already exist. Rebeca [39, 2] is an actor-based modeling language designed with the aim of 
bridging the gap between formal verification approaches and real applications. It provides conversions 
to renowned model checkers like SMV and Promela, and extends the model with timing constraints; 
timed-rebeca models have also been translated into Erlang. McErlang [26] is a model-checker specifically 
targeting Erlang code; it uses a superset of our logic. We are, however, unaware of any extensions 
of these tools to RV. Apart from detectEr [23], other RV tools for actor-based systems exist. eLarva [14] 
is another Erlang monitoring tool that uses the EVM tracing mechanism to perform asynchronous 
monitoring; no facility for synchronous monitoring is provided. In [38], Sen et al. explore a decentralized 
(orchestrated) monitoring approach as a way to reduce the communication overheads that are usually 
caused by a centralized approach and implement it in terms of an actor-based tool called DiAna. 
Although their investigation is orthogonal to ours, it would be interesting to integrate this study within ours 
and systematically evaluate whether choreographed monitor arrangements yield further overhead gains. 

By and large, most widely used online RV tools employ synchronous instrumentation OTl fl3l UT1 
■19] (8]. There is also a substantial body of work commonly referred to as asynchronous RV Ifl8l fl7 . 
M- However, the latter tools and algorithms assume completed traces, generated by complete program 
executions and stored in a database or a log file; as explained in the Introduction, we term these bodies 
of work as offline, and their aims are considerably different from the work presented here. There exist 
a few tools offering both synchronous and asynchronous monitoring, such as MOP [10, 11] and JPAX 
[28, 37]. Crucially, however, they do not provide the fine-grained facility of supporting both synchronous 
and asynchronous monitoring at the same time, or switching between the two modes at runtime. 

Rosu et al. [37] make a similar distinction to ours between offline, synchronous online and asynchronous 
online monitoring. However, their definitions of synchrony and asynchrony deal with the 
timeliness of detections. By contrast, we focus on how instrumentation is carried out and show how 
hybrid instrumentation can be used to obtain timely detections for certain properties; this amounts to 
synchronous monitoring in [37]. A closer work to ours is [16]: they allow a decoupling between the 
system and monitor executions but provide explicit mechanisms for pausing the system while the lagging 
monitor execution catches up. In our case, this mechanism is handled implicitly when switching 
from asynchronous to synchronous monitoring. In [16] they do not provide an implementation of their 
constructs and do not assess the relative overhead costs incurred by different instrumentation strategies. 

Talcott et al. in HU compare and contrast three coordination models for actors which cover a wide 
spectrum of communication mechanisms and coordination strategies. The comparison focusses on the 
level of expressivity of each model, the level of maturity, the level of abstraction and the way user 
definable coordination behavior is provided. One of the analysed coordination models, i.e., the Reo 
model, is a channel-based language in which channels may be either synchronous or asynchronous. This 
model resembles the way our hybrid monitoring protocol interacts with the monitored system. In fact, our 
monitoring language constructs [α] and [|α|] seem analogous to the asynchronous and synchronous 
channels in the Reo model. 


Future Work: Our hybrid instrumentation technique can be extended to other inherently asynchronous 
computational models, such as monitoring for distributed systems [22]. This would require us to consider 
additional aspects such as the handling of multiple, partially-ordered traces and the use of alternative 
monitor organisations such as monitor choreographies. Our techniques can also be extended with 
enforcement mechanisms iSTI, facilitated by the fact that corrective action can be carried out as soon as 
the system performs the violation. This would be worth exploring in the context of Erlang and detectEr. 